ABSTRACT



Discussion and research concerning class size can be traced 
back at least to the 12th century. An overview of recent research on the 
subject is provided in this report. The paper, which serves as an 
introduction to a symposium on class size, examines research that has 
appeared in the past 20 years, but it concentrates on the results of a 
longitudinal study, Project STAR (Student Teacher Achievement Ratio), that
was considered a controlled experiment for class size research. The results 
of STAR and other similar programs show that students do benefit from smaller 
class sizes, and these results are reinforced by any study that finds a 
positive relationship between tutoring and achievement, cooperative learning 
and positive results, and other programs that emphasize small-group learning.
Critics have claimed that the studies are in error or that, even if 
effective, such programs are much too expensive to implement. But, it is 
countered, research has not shown the harmful effects of small classes or 
that larger classes are better for children. It is hoped that the research on 
class size will influence educators and policy makers to move forward on this 
issue. (Contains approximately 125 references.) (RJM) 









IF NOT BEFORE, AT LEAST NOW 






Paper Presented at a Class-Size Symposium (4/14/98)



American Educational Research Association (AERA) 



1998 Annual Meeting, 4/13 - 4/17, 1998
San Diego, CA 



By 

C. M. Achilles,* Professor
Educational Leadership 
Eastern Michigan University 
Ypsilanti, MI 48197 
313-487-0255 (W) 
315-789-2399 (Summers) 
864-963-4789 (H) 




Other Symposium Participants

Jeremy Finn Alan Krueger 

Michael Kirst Henry Levin 



3/16/98



* Achilles was one of the four original Principal Investigators (PI) of the STAR experiment
(1984-1990), a senior consultant for LBS and Challenge (1989-1998) and for the DuPont
Study (1983-1996). He was PI for “Success Starts Small,” an observational study of “Life in a
Small Class” in grades K-2 (1993-1994), and he is co-investigator of the Burke County, NC
small-class efforts.





A Quick Tip-Toe Through Class-Size Antecedents 

Although I am not a historian by training, like many of the
rest of us I have lived enough years that the younger generation considers me
history. My task is to trace briefly the class-size concerns and research as an 
introduction to the serious ideas and papers prepared for this meeting. 

Although, according to Angrist and Lavy (1996), the study and use of
class size regarding student achievement began in the 12th century when
Maimonides, the great Rabbinic scholar, laid out the principles of class size
according to concepts presented in the Talmud, for my purposes the present
emphasis on class size dates from the Glass & Smith (1978) meta-analysis of a 
selection of some earlier studies. The Glass & Smith paper was followed quite 
quickly by two publications from the Education Research Service (ERS, 1978, 
1980), by the publication of an “Experimental study of the effects of class 
size” (Shapson, Wright, Eason & Fitzgerald, 1980), by the Glass, Cahen,
Smith, & Filby book (1982), and a book by Cahen, Filby, McCutcheon & Kyle
(1983). Except for Shapson et al., this ferment for future progress, like many
other changes in education direction, was essentially built by looking
backward. The interest was driven by analyses of studies from years ago, by
common sense, and by a growing uneasiness that present-day, generally 
poorly researched education practices will not address current problems. 

While this regeneration of class-size interest was occurring with the 
publication of a few studies, journal articles, and books, the State of Indiana was
quietly launching Project Prime Time (Chase, Mueller & Walden, 1986). 
Although Prime Time had provisions for evaluation, it was primarily a project, 
and not research. It began with the reduction of class sizes in grades 1 and 2 
in selected districts. A local-district option to reduce class sizes in either 
kindergarten (K) or grade 3 was available for the third year. Results generally 
favored small classes, but findings were mixed (Chase, Mueller, & Walden,
1986; Mueller, Chase, & Walden, 1988).

At about the same time as the first results were available from 
Prime Time, a small study was begun in two schools in metro-Nashville, TN. 
This study was initiated by Helen Bain, who not long before had served as
president of the National Education Association (NEA), where one of her main
interests was to get class sizes to a reasonable level so teachers could teach
and children could learn. Results of this study, the DuPont Study, became
available in journal form (Whittington, Bain, & Achilles, 1985; Bain, Achilles,
Dennis, Parks, & Hooper, 1988), and although the study was small in size, the
results were very large in impact.

DuPont results added to Prime Time and to earlier studies, and 
launched a major education experiment. The Tennessee legislature passed 
House Bill 544 which established a state-wide experiment to determine the 
effects of small classes (about 1 teacher and 15 students, or 1:15) on the 
achievement and development of early primary (grades K-3) youngsters. As a 
hedge against possible large costs of small classes, the legislature also wanted 
to know the benefits of a full-time instructional aide in a class of about 22-26 
pupils. 

Project STAR and Its Development 

Project STAR (Student Teacher Achievement Ratio) was a longitudinal
state-wide randomized experiment. By 1998, the database tracked more than
11,000 students who had been assigned at random to one of
the three conditions in the study. Those three conditions were a Small class
(S) of approximately 1:15 (a range of 13-17), a Regular class (R) averaging 
1:25 with a range of 22-26, and a Regular class with a full-time Aide (RA). 
STAR included 79 schools in 46 of Tennessee’s (then) 140 school districts. 
Researchers employed an in-school design to control for building and district 
variables: Any school that had an (S) class also had both other conditions (R, 
RA). Researchers also identified a set of comparison schools (n=21) matched 
closely with the STAR schools. From these schools they collected 
achievement-test data each year when STAR students were tested. Because of 
the parsimony and rigor of the in-school design, little use has yet been made 
of the comparison schools, except for an analysis of differences in random 
and non-random assignments of pupils in (R) classes (Zaharias, 1993;
Zaharias, Achilles, Nye, & Cain, 1995; Zaharias, Achilles, & Cain, 1995).

The STAR researchers followed youngsters who entered kindergarten 
in 1985 (n=6325) until they left grade 3 in 1989. Students were assigned at 
random to class sizes and teachers were assigned at random to classes. 
Students (about 1,200) who did not enter school in K, but did enter in grade 1 
were assigned at random when they entered STAR in 1986. Students stayed 
together each year (cohort), except for student mobility. Teachers were
reassigned as the cohorts moved through the grades. Except for random
assignments and the establishment of S, R, and RA classes, researchers
changed nothing else in the schools. The four principal investigators (PIs)
represented four Tennessee universities (Vanderbilt, Tennessee State
University or TSU, The University of Tennessee or UT, and The University of
Memphis). There were advisory boards, etc. A research design consultant
who was external to the study office (Finn) was hired to conduct the primary
STAR analyses. The PIs also analyzed data.

STAR’s Progeny

STAR cost over 12 million dollars in its first four years and generated 
many other studies, some of which are continuing today. Researchers in the 
Lasting Benefits Study (LBS) have been tracking STAR youngsters to see just 
how long and to what degree the small-class benefits would remain. 

Project Challenge was a policy application of STAR findings. Sixteen of 
Tennessee’s poorest and educationally low-scoring school districts were 
“challenged” to use the STAR findings to improve their student outcomes. If 
they reduced class sizes, the governor provided funds to help those districts. 
Although Challenge was not an experiment, researchers followed the results 
by tracking the average rankings of the Challenge districts among the state 
rankings of districts on student outcomes in reading and math. Districts that 
did use STAR results to reduce class sizes to about 1:15 in grades K-3 moved 
up in the state’s ranking of school districts on grade-2 and grade-3 tests (Nye
et al., 1993; Achilles et al., 1995; Mosteller, Light, & Sachs, 1996).

Researchers have used STAR’s large database to explore
education-related questions in many ancillary studies. For example, researchers
examined such issues as random vs. non-random assignment of students
using STAR and the comparison schools, school size and class size, class-size
effects to reduce the achievement gap between minority and non-minority
students, student behavior and discipline, student participation and
engagement in schooling, and the impact of class size on student
identification with schools. Table 1 lists some STAR-related studies, both 
those using the STAR database and other studies that began as a result of 
STAR findings. 

TABLE 1 ABOUT HERE 

Class size matters. 

The STAR, LBS, and Challenge results were made available each year.
Finn & Achilles (1990) discussed the results from STAR’s first two years.
Many articles, research reports, conference papers, monographs, and ERIC
entries have followed, in which the authors have discussed STAR results
and/or the results of ancillary studies with language targeted for a number of
different audiences. (A representative bibliography is included after the
References.)

Eventually, the STAR findings attracted some attention. Notable here 
were the critical comments of two respected researchers. Orlich (1991) said: 

The study lasted for four years and, in my opinion, is the most
significant educational research done in the US during the past
25 years (p. 632).

STAR was a tightly controlled, longitudinal experiment of class size.

Professor Emeritus Mosteller (1995) said about STAR:

This article briefly summarizes the Tennessee class size project, a 
controlled experiment which is one of the most important 
investigations ever carried out and illustrates the kind and 
magnitude of research needed in the field of education to 
strengthen schools (p. 113). 

Because a controlled education experiment (as distinct from a 
sample survey) of this quality, magnitude, and duration is a 
rarity, it is important that both educators and policy makers have 
access to its statistical information and understand its 
implications (p. 126).

Professor Orlich proposed using research results as a base for school 
improvement. Professor Mosteller (1995) and Mosteller, Light, and Sachs
(1996) argued forcefully that STAR and studies similar to STAR in terms of 
design and rigor should be used to inform educational policy decisions. 

Tennessee policy persons made efforts to reduce class sizes K-3 
throughout the state, and by 1994 other states began to follow suit. As
class-size information became more available, there have been visible uses of it,
such as in California where there was a voluntary state-wide effort to reduce
the class size in grades K-3. California’s initiative has been followed closely,
and coverage of it has appeared in general publications, such as Education
Week (e.g., Johnson, 1997), U.S. News and World Report, Time, etc.

As of January 1998, approximately 27 states either had class-size
legislation, had debated the topic seriously, or had initiatives to test out the 
impact of class-size reduction for various conditions. Educators and policy 
persons in several foreign countries are considering or are using class-size 
efforts: The Netherlands, England, Australia, Canada. There is some federal 
interest in class-size adjustments, especially in America’s poorest schools. 
(See President Clinton’s 1998 State of the Union message.)

Thus, from fairly small beginnings in about 1978-1980, it’s taken 
approximately 20 years for class size to be considered seriously, and about 10 
years for results of one education experiment (STAR) to get into general and 
relatively wide-spread use in American education. This is evidence of 
lethargy among educators and neglect by adults of the well-being of youths.

Some Contentiousness in Using Class-Size Results

Uses of STAR findings have generated predictable controversy in the 
literature and among researchers, politicians, and policy folks. Some people 
have said that there may be more efficient ways to improve student 
achievement. There are claims about the lack of efficiency of reducing class 
size in the early grades. This amazing assault on serious longitudinal,
replicable research is based on little but speculation -- Shakespeare might 
have said “sound and fury.” For example, how can one really understand the 
“efficiency” of reducing class sizes until there are enough small-class activities 
around for serious study of them? 

What research has shown harmful effects of small classes, or that 
larger classes are better for children? What successful education project or 
intervention does not rely on a small-class effect? Research on tutoring and 
on cooperative learning should be considered class-size research. Many 
alternatives to regular public education build upon a small-class effect: Home 
schooling, alternative schools, charter schools, expensive private schools, 
apprenticeships. 

Rather, the anti-class-size literature has been full of hypothetical 
discussions of how something else (we’re not quite sure what that is) might be 
a better way to get at the same achievement and behavior questions that we’re 
getting with the stream of class-size work. How much of the success of some 
popular projects and remedies should be attributed to small classes: Reading 
Recovery (RR), Success for All (SFA), and others? 

This Symposium as a Start on Heuristics and Systemics

Thus, we come to a symposium today where we’ll consider some class 
size research and activities. We’ll review not just what’s been going on
and some of the achievement and development findings; we’ll also begin to
consider some of the emerging economic analyses of class size, which should
NOT be confused with Pupil-Teacher Ratio, or PTR (Achilles, 1997; Achilles,
Sharp, & Nye, 1998; Lewis & Baker, 1997; US Department of Education, 1996
and 1997). Class size is the number of children in a class for whom the
teacher is responsible; PTR is the number of children at a site divided by the 
number of professional educators there. Class size influences student 
outcome positively (e.g., Finn & Achilles, 1990; Robinson, 1990; Wenglinsky, 
1997) and PTR doesn’t (e.g., Boozer & Rouse, 1995). 
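
The difference between the two measures can be made concrete with a little arithmetic. The sketch below uses a made-up school (four classes of 25 pupils plus three educators who hold no class of their own); the numbers and function names are assumptions for illustration only and do not come from STAR or any cited study.

```python
# Hypothetical school used only for illustration; none of these
# numbers come from STAR or from any study cited in this paper.

def average_class_size(class_rosters):
    """Mean number of pupils per class (pupils for whom one teacher is responsible)."""
    return sum(class_rosters) / len(class_rosters)


def pupil_teacher_ratio(total_pupils, professional_educators):
    """Pupils at a site divided by all professional educators at that site."""
    return total_pupils / professional_educators


rosters = [25, 25, 25, 25]        # four regular classes of 25 pupils each
classroom_teachers = len(rosters)
other_educators = 3               # assumed: specialists with no class of their own

class_size = average_class_size(rosters)
ptr = pupil_teacher_ratio(sum(rosters), classroom_teachers + other_educators)

print(f"average class size: {class_size:.1f}")
print(f"pupil-teacher ratio: {ptr:.1f}")
```

In this invented school the PTR comes out near 14:1, which resembles a STAR small class (about 1:15), even though every child actually sits in a class of 25. That gap is why the two measures should not be treated as interchangeable.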

We’ll surely have more of these discussions, as in the later years of 
STAR (STAR pupils are now mostly in grade 12) we’re now beginning to 
understand the long-term effects of early (S) education on later student 
behavior (e.g., Bain, et al., 1997); issues of student drop-out or retention in 
grade; the “trade-offs” in various implementations as policy analysis research; 
and some heuristics involving space use, etc. 

In the research emphasis on class size, the teacher aide (RA) question
has not been fully examined, but it can be since STAR’s design could as easily
make STAR an experimental study of RA effects. It is noteworthy here that of
the three STAR conditions, (S) was best, generally followed by (R) and then
(RA). This finding may help explain some of the mixed results in Prime Time
(Chase, Mueller, & Walden, 1986) and the continuing run of poor evaluations
of one aide-loaded federal education policy, Title I.

As added answers to our questions become available, we shall have 
much more definitive information about whether or not reducing class size is 
“efficient.” We believe that the STAR-generated stream of class-size research 
has answered the question about the effectiveness of early small-class 
interventions. We’re not sure what the relationship should be between 
efficiency and effectiveness when we’re talking about people, and particularly 
about the very youngest people who are beginning their long trek through our 
education system. Benjamin Bloom (1984 a&b) asked that educators seek 
answers to his “2-sigma problem” and “search for methods of group 
instruction as effective as one-to-one tutoring.” Appropriate-sized classes in 
K-3 are a start: they offer Quality (higher achievement), Equality (all 
participants get the same), and Equity (minority and hard-to-teach youngsters
benefit more) (Achilles, Finn, & Bain, 1997-98; Finn & Achilles, 1990;
Robinson, 1990; Wenglinsky, 1997, etc.).

Critical discussion and debates about class-size processes have been 
initiated by economists, policy folks, and statistical types, such as Burtless 
(1996), Card & Krueger (1996), Hanushek (1995, 1996), Hedges and others 
(1994, 1996). A recent wave of added interest in the economics of class-size 
processes and outcomes is evident in the work of Angrist and Lavy (1996),
Boozer and Rouse (1995), Correa (1993), Krueger (1997), and Wenglinsky
(1997). Soon we might expect to see class size connected to space usage
(proxemics) and the possibility that crowding little children may contribute to 
later difficult behavior, such as the onset and nurturing of gangs in schools, or 
that large classes add to stale air that adds to teacher fatigue and student 
inattentiveness late in the school day. What are the implications of (S) for use 
of time and technology? For improved school-home relationships? For 
innovative use of space and personnel? How does early schooling in small 
classes extend recent findings of brain research, cognitive psychology, and
neuroscience?

If we’ve not had really serious discussions on class-size issues and
implications before, at least let’s get serious about a research-driven base for
major policy shifts in American education. We know what to do to improve
early schooling for children. How to do what research shows should be done
is a fair question for enlightened policy discussions, political decisions,
educational leadership, and a new series of education studies. Time is
wasting. Let’s start. NOW!

Table 1. Samples of Studies Derived from and Building upon STAR,
Classed as “Subsidiary” (directly from STAR), “Ancillary” (building on STAR
database) and “Related” (usually involving STAR researchers).

CATEGORY, TITLE & PURPOSE*                 DATE(S)           AUTHOR(S) OR PUBLICATION DATE

STAR (Many sources)                        1985-1989         Word, et al., 1991
                                                             Finn & Achilles, 1990

Subsidiary Studies
• Lasting Benefits Study                   1989-Present      Nye et al., 1991-1996
• Project Challenge (TN)                   1989-Present      Nye et al., 1991-1996
• Participation in Grades 4, 8             1990, 1996        Finn, 1989, 1993; Voelkl, 1995
                                                             Finn, et al., 1989, 1990
                                                             Finn and Cox, 1992
• Follow-up of STAR students               1996-1998         HEROS (1997)

Ancillary Studies (Use or extend STAR. Some dissertations.)
• Retention in Grade                       1994              Harvey, 1994
• Achievement Gap                          1993-1995         Bingham, 1993
• Value of K in Classes of Varying         1985-1989         Achilles, Nye, Bain
  Sizes (test scores)
• School Size and Class-Size Issues        1985-1989         Nye, K., 1995
• Random v. Non-Random Pupil               1985-1989         Zaharias, et al., 1995
  Assignment and Achievement
• Class Size and Discipline in             1989, 1991,       Several studies.
  Grades 3, 5, 7                           1996, etc.        Hibbs (1996).
• Outstanding Teacher Analysis             1985-1989         Bain et al., 1992
  (top 10% of STAR teachers)

Related Studies
• Success Starts Small: Grade 1 in         1993-1995         Achilles et al., 1995
  Chapter 1 (1:14, 1:23) Schools
• Burke Co., NC Study                      1992-1998         Achilles et al., 1994
• Education Production Functions           1996-1997         Krueger, A. B. (1997)

* This list is not complete. It provides samples of the types of studies done. Not all
authors appear in the references in the exact way listed here. This table appears in
several STAR reports in substantially this same form. For a list of all references, see
Achilles (1996b).

This section contains both citations in text (References) and selected
references to class-size studies and supporting data (Bibliography).